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Abstract 

We describe the object-oriented implementation of a higher-order finite- 
difference density-functional code in Fortran 90. Object-oriented models of 
grid and related objects are constructed and employed for the implementa- 
tion of an efficient one-way multigrid method we have recently proposed for 
the density- functional electronic-structure calculations. Detailed analysis of 
performance and strategy of the one-way multigrid scheme will be presented. 
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I. INTRODUCTION 

In recent years, the usefulness of the real-space technique based on three-dimensional 
uniform grid and accurate forms of finite-difference formula has been demonstrated for 
the electronic-structure calculations |1|^- Since ah the computations are performed in 
real space, the major part of the calculations are local operations, which makes the algo- 
rithm easily parallelized and implementation of order-N algorithm transparent. This 
is in contrast to plane wave method methods which rely upon global fast Fourier trans- 
forms. Furthermore, since the Laplacians and potential-wave functions multiplications are 
respectively evaluated by the finite-difference operation on the wave functions and a simple 
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one-dimensional vector multiplications on the fly, explicit storage of the Hamiltonian ma- 
trix elements can be avoided and the matrix diagonalization can be efliciently performed by 
iterative diagonalization methods such as conjugate gradient method. 

Although there has been great emphasis on the algorithmic developments in the real- 
space electronic-structure calculation schemes, we feel that the issue of the code construction 
and organization has been relatively neglected. Recently, we have proposed an efficient and 
easy-to- implement one-way multigrid algorithm which results in significant enhancement in 
computation speed for grid-based iterative electronic-structure calculations 0. In spite of 
the advantage of the multigrid in general P JTI1|JT^ , it requires complications of coding and 
organization of data structure because it involves several grid levels and data transfer be- 
tween them. In such a situation, we found the modern programming construction paradigm, 
object-oriented^ programming |]13[ can be useful. Our original code was written in Fortran 
90 0] with heavy recycling of our previous planewave codes written in Fortran 77, in 
conventional non-object-oriented programming style, with the purpose of typical scientific 
programming, namely the quick implementation of a physical idea. Although we tried to 
keep the code maintainable and clear, it quickly became long and messy with each addi- 
tion of function and implementation of an idea. We recognized that the complexity of the 
conventional programming style is an obstacle, or at least a complication, especially for the 
implementation of multigrid. Hence, we restructured the code by introducing object-oriented 
modeling concepts, and in this work we will report our experience of this transition and the 
implementation of the multigrid method with the newly designed code. Although object- 
oriented programming should be most straightforward in object-oriented languages such as 
C++, it is in principle also possible in non-object-oriented languages |T3|, and especially 
relatively easy with Fortran 90 which supports many ingredients of object-oriented coding. 
Object-oriented scientific programming using Fortran 90 has been of much interest in recent 



years |15|-[T9|], and this paper will add additional information to this discussion. In addition, 
since grid-based simulations are common in other scientific and engineering computations, 
we expect our work is a helpful guide for the code construction in those fields. 

The outline of the current paper is as follows. In Sec. |l|, we first describe the electronic- 
structure calculation scheme within the Kohn-Sham density-functional theory, and our 
methodology based on the higher-order finite difference formulation. In Sec. |T|, we re- 
view the key concepts of object-oriented methodologies, and describe the introduction of 
object-oriented concepts into our grid-based program written in Fortran 90. In Sec. we 
briefiy review the multigrid theory, and describe the one-way multigrid scheme which we 
have recently proposed [§]. Implementation of the one-way multigrid method is discussed, 
and especially the simplification induced by the object-oriented design is emphasized. The 
impressive enhancement of computational efficiency due to the introduction of the one-way 
multigrid method has been demonstrated in our previous publication, and here more de- 
tailed analysis of performance test and multigrid strategy will be reported. The current 
work will be summarized in Sec. 0. 



^Although "object-orientation" is both a language feature and a design methodology, this paper 
is primarily concerned with the design aspect. 
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II. HIGHER-ORDER FINITE-DIFFERENCE KOHN-SHAM 
ELECTRONIC-STRUCTURE CALCULATION METHOD 



The (spin-dependent) Kohn-Sham (KS) density-functional theory (DFT) is an 
independent-electron theory in which one obtains the single-particle wave functions 'i/'cri(r) 
for spin channel cr =t, | and eigenvalues e^i by solving KS equations (Hartree atomic units 
are used throughout the paper) 



cfT,crV 



(1) 



with the spin density 



(2) 



i=l 



where N„ is the number of occupied a spin orbitals. The effective KS potential Kff^^(r) is 
composed of external, Hartree, and exchange-correlation contributions. 



VeZir) = Kxt(r) + Vk(r) + Kc,.(r) 



(3) 



among which Vh(i") and V^c,(T(r) depend on the charge density (and wave functions for 
orbital-dependent Kcc,(T(r) 0), hence Eqs. (|l|),(@), and (H) form a self-consistent system of 
equations. In the usual local density approximation, Vxcai^) is calculated inexpensively, 
hence the solution of KS equations [Eq. ([l|)] and generation of Vn{r) comprise main parts 
of calculations. For localized systems, VH(r) are typically obtained by solving the Poisson 
equation 



VVH(r) = -47m(r) 



(4) 



where ?7,(r) is the total charge density n{r) = n|(r) -|- n^ir). 

In our higher-order finite-difference real-space formulation [0^0 , we discretize both KS 
and Poisson equations on a three-dimensional uniform grid (with grid spacing h) with a 
higher-order finite difference representation (with finite-difference order A^) 



N 



fix) 



C,f{x + jh) + 0{h 

i=-N 



2N+2\ 



(5) 



where {Cj} are constants calculated by the algorithm of Ref. 0]. KS equations have 
been solved by the preconditioned conjugate gradient (CG) method [0,^ supplemented 
by subspace diagonalizations with localize(j^ wavef unctions. The Hartree potential has been 
obtained from the solution of Poisson equation [Eq. (^)] on the entire simulation box, by 
first generating boundary values with multipole expansion, and then propagating solutions 
inside of the box with the combination of the low-order finite-difference (N=l) fast Fourier 



^Vanishing boundary condition is used for wavefunctions. See Sec. [JIB. 
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transform method and the higher-order finite-difference (typically N=5) preconditioned CG 
method. At each self-consistent step we generate a new input Hartree-exchange-correlation 
potential using the simple linear mixing of output and input potentials. The computational 
procedure is summarized in Fig. |l|. Further details of the computation method can be found 
in Ref. |,|]. 

III. FORTRAN 90 IMPLEMENTATION OF OBJECT-ORIENTED CONCEPTS 

A. Object-oriented programming 

Roughly speaking, an object consists of a set of operations on some hidden data. Fol- 
lowing Rumbaugh et al. [|l^, the key components of the object-oriented approach are: 

1. Identity — data is quantized into discrete, distinguishable objects. 

2. Classification — objects with the same data structure and behavior are grouped into 
a class. 

3. Polymorphism — the same operation may behave differently on different classes. 

4. Inheritance — sharing of data structures and behaviors among classes based on a 
hierarchical relationship. 

In addition, there are several themes underlying object-oriented technology which give cor- 
responding benefits: 

• Abstraction — focusing on the essential, inherent aspects of an entity enhances the 
understanding of the problem itself, and preserves the freedom to make decisions as 
long as possible by avoiding premature commitments to details. 

• Encapsulation — separating the external aspects of an object from the internal imple- 
mentation details of the object prevents a program from becoming so interdependent 
that a small change has massive ripple effects. 

• Combining data and behavior — keeping data structure hierarchy identical to the oper- 
ation inheritance hierarchy shifts the burden of deciding what implementation to use 
from the calling code to the class hierarchy. 

• Sharing — sharing of code using inheritance induces savings in code and more impor- 
tantly conceptual simplicity by reducing the number of distinct cases that must be 
understood and analyzed. 

• Emphasis on object structure, not procedure structure — stressing what an object is, 
rather than how is is used, makes the program more stable in the long run, since 
the features supplied by an object are much more stable than the way it is used as 
requirements evolve with time. 

• Synergy — using identity, classification, polymorphism, and inheritance together results 
in usually cleaner, more general, and more robust program. 

These abstract ideas will be made concrete by examples in the next section. 
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B. Fortran 90 implementation of object-orientation 

In this section we describe the object-oriented construction of grid-related objects in 
Fortran 90. Fortran 90 keywords will be denoted as bold uppercase characters, and object- 
oriented concepts relevant to the discussion will be shown in italic characters. Before pro- 
ceeding, we briefly consider the modeling of objects: The most basic components in the 
finite-difference electronic-structure code is the grid, which has the information of grid co- 
ordinates and number of grid panels along the x—,y—, and 2— directions. We choose to 
use a uniform grid along each direction. In actual calculations, however, we employ only a 
localized region in real space to save the memory and enhance the computational efficiency. 
In addition, the grid is apparently constructed in a simulation box with basic information on 
the grid starting and finishing coordinates. So, we actually have a hierarchy of three grid- 
related physical objects whose two-dimensional representations as shown in Fig. ^. Below 
we present the implementations of this hierarchy of concepts using TYPEs of simulation 
box (simbox), grid (grid), and sub-grid (subgrid)^ 

□ MODULE for TYPE simbox and corresponding procedures 

MODULE m.simbox 
IMPLICIT NONE 
PRIVATE 

PUBLIC :: simbox, new, display, .. . 
TYPE simbox 

! Initial/final coordinates of the simulation box 

! along x/y/z-dir. 

REAL :: xi ,xf ,yi ,yf ,zi ,zf 

END TYPE simbox 

INTERFACE new 
MODULE PROCEDURE simbox.construct 

END INTERFACE 

INTERFACE display 
MODULE PROCEDURE simbox_print 

END INTERFACE 

CONTAINS 

SUBROUTINE simbox.construct (xl , x2 , y 1 , y2 , zl , z2 , box) 
! Assign given initial/final coordinates of the simulation 
! box to 'simbox' components. 
REAL, INTENT(IN) :: xl ,x2,yl ,y2,zl ,z2 



^All the program listings in this paper have been simplified from original versions for clarity of 
presentation. Each module includes more subroutines, and double-precision has been used for real 
variables. 



5 



TYPE(simbox) , INTENT(OUT) :: box 

END SUBROUTINE simbox.construct 

SUBROUTINE simbox_print (box , name) 

! Print out simulation box information. 

TYPE(simbox) , INTENT(IN) :: box 

CHARACTER* (*) , INTENT ( IN) , OPTIONAL :: name 

END SUBROUTINE simbox.print 
END MODULE m.simbox 



□ MODULE for TYPE grid and corresponding procedures. 



MODULE m_grid 

USE m_simbox, ONLY: simbox 

IMPLICIT NONE 

PRIVATE 

PUBLIC : : grid, new, delete, display, . . . 
TYPE grid 

! Pointer to simulation box 

TYPE(simbox) , POINTER : : pt.simbox 

! Number of grid panels along x/y/z dir. 

INTEGER : : nx,ny,nz 

! Number of total grid points 

INTEGER : : ngrid 

! Grid spacings along x/y/z dir. 

REAL : : dx , dy , dz 

! grid coordinates along x/y/z dir 

REAL, DIMENSIONC:) , POINTER :: xcrd,ycrd,zcrd 

END TYPE grid 

INTERFACE new 
MODULE PROCEDURE grid.construct 

END INTERFACE 

INTERFACE delete 
MODULE PROCEDURE grid.destruct 

END INTERFACE 

INTERFACE display 
MODULE PROCEDURE grid.print 

END INTERFACE 

CONTAINS 

SUBROUTINE grid.construct (box , nl , n2 , n3 , grd) 
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! For the given simulation box, and the number of grid 
! panels along each direction, assign/construct grid 
! components. 

TYPE(simbox) , INTENT(IN), TARGET :: box 

INTEGER, INTENT(IN) :: nl,n2,n3 

TYPE (grid), INTENT (OUT) :: grd 

! Simulation box coordinates 
grd°/opt_simbox => box 

! Grid coordinates generation. 
ALLOCATE (grdy.xcrd(0:nl) , ... STAT=ierr) 
IF(ierr/=0) ... ! Allocation error handling 

END SUBROUTINE grid.construct 
SUBROUTINE grid.destruct (grd) 
TYPE (grid) : : grd 

! Nullify pointer to 'simbox' 

IF (ASSOCIATED (grd%pt_simbox) ) NULLIFY (grdZpt.simbox) 
! Deallocate 'xcrd' , 'ycrd' , 'zcrd' 
IF (ASSOCIATED (grdZxcrd) ) THEN 

DEALLOCATE (grd^xcrd , STAT=ierr) 

IF(ierr/=0) ... ! Deallocation error handling 
ENDIF 

END SUBROUTINE grid_destruct 

SUBROUTINE grid.print (grd, name) 

! Print out grid information. 
USE m_simbox, ONLY: display 
TYPE(grid), INTENT(IN) :: grd 
CHARACTER* (*) , INTENT ( IN) , OPTIONAL :: name 

CALL display (box) 

END SUBROUTINE grid.print 

END MODULE m_grid 



□ MODULE for TYPE subgrid and corresponding procedures. 

MODULE m.subgrid 

USE m.grid, ONLY: grid 
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IMPLICIT NONE 
PRIVATE 

PUBLIC :: subgrid, new, delete ,.. . 

TYPE subgrid 
! Pointer to 'grid' 
TYPE (grid) , POINTER : : pt.grid 
! Number of sub-grid points 
INTEGER : : nsubgrid 
! local grid index number, 
! for outside of localized region. 
INTEGER, DIMENSIONC: , : , :) , POINTER :: index 

END TYPE subgrid 

INTERFACE new 
MODULE PROCEDURE subgrid.construct 

END INTERFACE 

INTERFACE delete 
MODULE PROCEDURE subgrid.destruct 

END INTERFACE 

CONTAINS 

SUBROUTINE subgr id.construct (grd , sbgrd) 
TYPE(grid), INTENT ( IN) , TARGET :: grd 
TYPE (subgrid) , INTENT(OUT) :: sbgrd 

! Assign pointer to full grid. 
sbgrd%pt_grid => grd 

! Allocate subgrid index array and initialize 
ALLOCATE ( sbgrdy.index (0 : grdy.nx , : grdy.ny , : grd^nz) , & 
STAT=ierr) 

IF(ierr/=0) ! Deallocation error handling 

END SUBROUTINE subgrid.construct 
SUBROUTINE subgrid_destruct (sbgrd) 
TYPE (subgrid) : : sbgrd 

! Nullify pointer to 'grid' , 'pt_grid' 

IF(ASSOCIATED(sbgrdy„pt_grid) ) NULLIFY (sbgrd%pt_gr id) 

! Deallocate 'index' array of type 'subgrid' variable 

IF (ASSOCIATED (sbgrdyoindex) ) THEN 
DEALLOCATE (sbgrdy.index , STAT=ierr) 
IF(ierr/=0) ! Deallocation error handling 

ENDIF 

END SUBROUTINE subgrid.destruct 
END MODULE m.subgrid 
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First, note that in each case we define a new TYPE which consists of corresponding 
variables, such as xi,xf, etc. for simbox and nx,ny,nz, etc. for grid. The abihty to define 
derived TYPEs is a crucial ingredient of object-oriented code construction, [abstraction] By 
using these newly defined TYPEs, we construct/destroy corresponding variables together, 
and especially pass them to procedures (FUNCTIONS and SUBROUTINES) as a single 
argument, which enables the procedure interfaces to be simple and stable, [identity] We 
locate a TYPE definition in the corresponding MODULE to make it globally accessible. 
In addition, note that we hide the implementation details by first declaring all the entities 
in the MODULE as PRIVATE and hst only exceptions as PUBLIC for data hiding, 
[encapsulation] New TYPE definitions made PUBLIC to outside can be USEd in the 
calling routines by including the corresponding module, and individual components of the 
TYPE can be accessed by following the variable by a percentage sign % and the name of 
the component^. Note that, for enhanced safety, we use ONLY qualifier to access public 
entites in the USEd module. 

Secondly, in all the MODULEs, we CONTAIN procedures which operate on the cor- 
responding TYPE definition. Hence when employing each MODULE, the user will access 
the data structure and its behavior at the same time, [classification] Hence at this stage we 
have achieved the basic requirements for the "object-orientation" : we organized program as 
collection of discrete objects that incorporate both data structure (TYPE) and behavior 
(procedures attached to the TYPE by being CONTAINed). 

Next, note that, when we CONTAIN procedures, we employ MODULE PRO- 
CEDURE statement to give them generic names, and make those generic names (in- 
stead of original names of the procedures) be accessible from the outside by giving 
them a PUBLIC attribute. In doing so, we can give different procedures a single 
generic name. For example, we use the same generic name new for different proce- 
dures of TYPEs, simbox (simbox_construct), grid (grid_construct), and subgrid 
(subgrid_construct) . [polymorphism^ 

Finally, in the MODULE m_grid, TYPE grid inherits the simbox information, and 
again this grid information (including that of simbox) is inherited to TYPE subgrid 



^ One can list all the entities in each MODULE as PRIVATE or PUBLIC, or make the default 
PUBLIC and then only list exceptions as PRIVATE. However, for the purpose of data hiding, 



the current form is strongly recommended [14 



^ One can even hide the data components of a derived TYPE to the outside of the MODULE by 
preceding the first component declaration in the derived TYPE definition by PRIVATE attribute. 
In that case, to access the components, it is required to write procedures which manipulate and 
return the components and include them in the same MODULE. 

^ To be more specific, this is static polymorptiism. A good discussion on how to implement a 
run-time polymorptiism can be found in Ref. 0]. 
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in MODULE m_subgrid. [inheritance] So, TYPE grid variables will contain informa- 
tion on the simulation box (xi,xf, etc), and TYPE subgrid variables will contain the 
information on the grid (nx,ny,nz, etc.) and the simbox in which the grid has been 
constructed. Note that for the implementation of this hierarchy structure, we have used 
POINTERS. In Fortran 90, to avoid execution efficiency degradation, all objects to which 
a POINTER may point should have a TARGET attribute. Hence, input variables (IN- 
TENT attribute IN) box in the procedure grid_construct (generic name new) of the 
MODULE m_grid and grd in the procedure subgrid_construct (generic name new) of 
the MODULE m_subgrid have the TARGET attribute. POINTERS have been asso- 
ciated with TARGETS by POINTER assignment statements: 
pointer —> target 

in the procedures grid_construct and subgrid_construct. In addition inheriting data, 
it is also possible for a procedure to inherit another procedure. In our example, a procedure 
grid_print in the MODULE m_grid inherits (by USE statement) another procedure 
simbox_print (generic name display) located in the MODULE m_simbox. Again, note 
that we give a generic name display to both grid_print and simbox_print. 

Now we comment on additional Fortran 90 language features related with our examples. 
First, we do not use a MODULE as a storage place of global variables, although it is pos- 
sible to do so, because we found it is rather clumsy and risky for the large-scale coding due 
to the problem of global storage similar to that arises in the usage of COMMON block in 
Fortran 77. MODULE is exclusively used as a place for TYPE definitions and correspond- 
ing procedures. Next, since ALLOCATABLE arrays cannot be used in derived TYPE 
definitions, we use instead POINTERS (see grid_construct and subgrid_construct). 
Again, in this case of POINTER usage, we arrange the allocation to occur in a well- 
defined corresponding procedure contained in the same module, which makes the usage of 
the derived TYPE more safe and robust. Note that pointer disassociation (NULLIFY) 
and deallocation (DEALLOCATE) are also handled in a similar way (see grid_destruct 
and pwdsubgrid_detruct). 

Before closing this section, we summarize the strategy of the modeling of a new (or the 
remodeling of a present) large-scale code in Fortran 90 with object-oriented concepts, which 
we found useful. 

1. Identify an object and define the corresponding variables as a TYPE. These variables 
are typically global variables, or frequently passed variables from the main program 
to procedures, used together in a procedure in the conventional non- object-oriented 
programming style. Be careful on the hierarchy (dependence) of the objects. 

2. Construct (identify) subroutines closely related with the TYPE. (Re) arrange the pro- 
cedure interfaces (and contents if necessary) using the TYPE definition. 

3. Define a MODULE corresponding to the TYPE, and include the TYPE definition 
from step 1 and CONTAIN procedures identified in step 2 in the MODULE, and 
give them generic names. 

4. Make the TYPE definition and generic names of procedures PUBLIC. 

Remind that the initial stage takes most of the time in the object-oriented approach, which 
is especially true in the remodeling of an existing code, since the remaining process is 
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mostly changing interfaces and variable names in existing procedures and including them in 
MODULES (assuming that the previous code is well-designed, hence the restructuring is 
straightforward). It should be also emphasized that we find this process is actually beneficial 
for doing physics itself, in that the programmer (usually the physicist herself or himself) has 
to consider (reconsider in the case of remodeling of a present code) and identify carefully 
the structures embedded in the problem under consideration, hence results in making one 
focus more on the physical pictures. We refer the reader to Ref. for further discussion 
of the benefits of object-oriented programming. 



IV. OBJECT-ORIENTED IMPLEMENTATION OF MULTIGRID METHODS 

A. Multigrid method 

We first briefly review the theory of the multigrid method p|-[T2|. The fundamental idea 
behind all multigrid methods is to combine computations done on different scales, based 
on the observation that many iterative methods tend to reduce the high-frequency (i.e. 
oscillatory) components of the error rapidly but reduce the low-frequency (i.e. smooth) 
components of the error much slowly. Since the notions of smooth or oscillatory components 
of the error are relative to the mesh on which the solution is defined, in particular, a 
component that appears smooth on a fine grid may appear oscillatory on a coarse grid, 
we can naturally think of using coarser grid to reduce this (now oscillatory) error and 
interpolate back to the fine grid. The key to the successful implementation of a multigrid 
method is the choice of grids of different scales, the strategy to proceed through them, and 
how we move objects among them. Broadly the multigrid algorithms can be divided into 
two basic categories 



1. Correction methods — start at the finest level, and use the coarser levels solely to 
compute a correction which is added to the approximate solution on the finest level. 

2. Nested iteration methods — generate initial guesses on coarser levels and frequently 
reuse coarser levels for corrections also. 

Specific examples of these correction path and nested iteration path are shown in Fig|^(a) 
and (b). Roughly speaking, when a good initial guess is available, a correction algorithm 



can be used, otherwise we should use a nested iteration scheme [12 



B. One-way multigrid method in electronic-structure calculations and its 

implementation 

Multigrid is a quite general concept, and apparently the choice of a specific algorithm 
depends on the nature of the problem under consideration. We have recently demonstrated 
that the introduction of a simple one-way multigrid method (Fig. ^b) greatly improves 
the efficiency of real-space electronic-structure calculations based on the iterative solution 
of KS equations P]. The motivation of our work was based on the observation that the 
most time-consuming part of the self-consistent electronic-structure calculations described 
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in Sec. |T| is the iterative solution of KS equations [0,^]. The sources of this computation 
bottleneck can be traced to broadly two (but closely related) aspects of self-consistent iter- 
ative diagonalization schemes. First of all, in general we do not have a good initial guess of 
wave functions, which generate density, and hence Vu^r) and V^co-lr) in Eq- (H)- So initial 
several self-consistency steps will be used to obtain solutions of biased Hamiltonians, al- 
though they tend to be the most time-consuming part. Secondly, in single iterative solution 
of KS equations, a direct application of a relaxation method on the fine grid has trouble in 
damping out the long-ranged or slowly varying error components in the orbitals. This can 
be understood by the usual spectral analysis of relaxation scheme ITT]], or considering that 



the nonlocal Laplacian operation on a fine grid is physically short-ranged. 

Hence, for our purpose, we seek a multigrid procedure which generates a good initial 
guess for the finest grid calculation and effectively removes long-range error components 
of wave functions in the solution of KS equations [Eq. (|l|)]. While an efficient interpo- 
lation/projection scheme is a crucial ingredient of any successful application of multigrid 
method, we note that it can be also time-consuming and tricky part since in our case we 
need to transfer a large number of wave functions which should observe the orthonormality 
conditions. Hence our strategy, which is the characteristic of the scheme, is to minimize the 
number of data transfer between different grid levels, and employ an accurate interpolation 
method which is very accurate and allow us to use even a noninteger ratio of grid spacings: 
The calculation starts from the coarsest grid 2h, and in each grid-level calculation, Eqs. (|I]) 
and (^) are solved self-consistently as in the usual single-level algorithm shown in Fig. |l]. Af- 
ter each self-consistent calculation on a coarse grid, only wave functions are interpolated to 
the next fine grid, and another set of self-consistent calculation is performed. Since that the 
interpolated wave functions usually do not satisfy the orthonormality condition any more, 
we take an extra Gram-Schmidt orthogonalization process after each orbital interpolation. 
Hence we have n — 1 interpolations and Gram-Schmidt orthogonalization processes for the 
n— level multigrid calculations. For the interpolation, we used a three-dimensional piecewise 
polynomial interpolation with a tensor product of one-dimensional B-splines as the interpo- 
lating function [0,^ . A piecewise cubic polynomials have been taken as B-splines. In Fig. 
^, we summarize the computational procedure for the case of three grids, 2h, 1.5h, and h. 
We refer the reader to Ref. |§] for further discussion of the method and the comparison with 
other multigrid schemes 

Now we describe the implementation of the one-way multigrid algorithm of Fig. ^ using 
the object-oriented components constructed in Sec. |illlj| . The main part of the code is 
shown below. 

SUBROUTINE mg_ks_dft(box,ng, . . .) 
USE m.simbox, ONLY: simbox 
USE ni_grid, ONLY: grid, new, display, delete 
USE m.subgrid, ONLY: subgrid, new, delete 
USE m.wavef unction, ONLY: wavef unction, new, delete 

IMPLICIT NONE 

TYPE(simbox) , INTENT(IN) :: box ! Simulation box information 
INTEGER, INTENT (IN) : : ng(3) ! Number of grid panels 
! at the finest grid 
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! Local variables 

TYPE(grid), ALLOCATABLE, DIMENSION(:) :: grd 

TYPE(subgrid) , ALLOCATABLE, DIMENSION(:) :: sbgrd 

TYPE (wavefunct ion) , ALLOCATABLE, DIMENSION( : ) , TARGET :: wf 



! Allocatable arrays 

! 'nlevel' is number of multigrid level 
ALLOCATE(grd(nlevel) , sbgrd (nlevel) ,wf (nlevel) ,STAT=ierr) 
IF(ierr/=0) ... ! Error handling 

! One-way multigrid loop 
DO ilvl = 1, nlevel 

! Assign number of grid panels for each multigrid level 
! 'ratio' is grid spacing ratio with finest grid as 1 

nl = NINT(ng(l)/ratio(ilvl)) 

n2 = NINT(ng(2) /ratio (ilvl)) 

n3 = NINT(ng(3)/ratio(ilvl)) 

! Generate grid for level='ilvl' 
CALL new(box,nl,n2,n3,grd(ilvl)) 
CALL display (grd(ilvl)) 

! Generate subgrid for level='ilvl' 
CALl new(grd(ilvl),sbgrd(ilvl)) 

! Effective region allocation in the subgrid 



! Generate wavef unctions 
CALL new(. . . ,wf (ilvl)) 

! If level>l, interpolate w.f.(ilvl-l) => w.f.(ilvl) 
IF(ilvl>=2) THEN 

! Interpolate: wf(ilvl-l) => wf(ilvl) 
! and orthonormalize them 



! Destruct: grid(ilvl-l) , subgrid ( ilvl- 1) , wf(ilvl-l). 
CALL delete (grd(ilvl-l)) 
CALL delete (sbgrd(ilvl-l)) 
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CALL delete (wf(ilvl-l)) 
END IF 

! Adjust calculation parameters according to grid level. 



! Solve KS-equations for level='ilvl' 
CALL ksdf t (...) 

! Remove grid(ilvl) , subgrid(ilvl) , wf(ilvl) 

! at the last MG step 

IFCilvl == nlevel) THEN 
CALL delete (grd(ilvl)) 
CALL delete (sbgrd(ilvl)) 
CALL deleteCwf (ilvD) 

END IF 

END DO 

! Deallocate arrays 
DEALLOCATE (grd , sbgrd , wf ) 



END SUBROUTINE mg_ks_dft 



The simpUfication of the coding induced by object-oriented programming style should 
be obvious in this example. Since the simulation box is assigned only once in the current 
method, it has been generated once in the higher level and passed as an input variable, 
and only grids and sub-grids built in the simulation box have been ALLOCATEd for the 
number of multigrid levelsQ In addition to the objects we described in Sec. [IIB , we use 
TYPE wavefunction which has not been described but has been constructed in a similar 
way as others. Again, note that we use the same generic names of procedures for different 
TYPEs, new and delete. It should be also noted that actual allocatable arrays in TYPE 
grid (xcrd, etc.), subgrid (index), and most importantly wavefunction whose size is 
(number of grid points) x (number of states) x (number of spins) are only ALLOCATEd 
when they are required, and those arrays are DE ALLOCATEd as soon as they become 
unnecessary. These processes are elegantly handled by new and delete calls. 



'However, simbox objects can be also frequently generated if we perform adaptive mesh refine- 
ment type calculations. 
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C. Performance test 



For the analysis of performance enhancement due to our multigrid method, we reconsider 
a quasi- two-dimensional quantum dot model that has been employed in Ref. ||^, in which 
a 20-electron quantum dot has been studied with one-level (h), two-level (2/i and h), and 
three-level {2h, 1.5h, and h) methods. Here, a more detailed analysis of performance is 
provided, with varying number of electrons up to 24 and additional four-level (4/i, 2h, 1.5h, 
and h) calculations. While calculations were performed on the entire simulation box with 
the original code in Ref. [Q , here we employ the newly designed object-oriented code which 
use only the grid points inside of a spherical region with a radius 8.0 a*^. Two calculations 
are further different in simulation parameters. 

Quantum dot in GaAs host material (dielectric constant e = 12.9, effective mass m* = 
0.067me) is modeled by an anisotropic parabolic confinement potential Kxt(r) = \^x^'^ + 
^Uy'y'^ + ^u'^z'^ , in which the 2;-axis is taken as the dot growth direction. As in Ref. [Q, we use 
the confinement energies = Uy = 5 meV, and = 45 meV. Our calculations are based on 
the effective mass approximation, and rescaled length and energy units are respectively a*^ 
= 101.88 rA and 10.96 meV. Uniform grid spacing h = 0.3a% with box size 18 x 18 x 18a^^ 
have been used. Incorporation of the spherical local region results in the usage of only about 
35% of total number of grids, hence the number of grid points involved in the calculations 
is 1.2 X 10^ for grid Ah, 1.0 x 10^ for grid 2h, 2.4 x 10^ for grid 1.5/i, and 7.9 x 10^ for 
grid h. Finite-difference order [Eq. (^] for the solution of the KS equations and Poisson 
equation are chosen such that the range of the physical coverage is approximately same, so 
A^ = 5 for h, N = 3 for 1.5h and 2h, and A^ = 1 for 4h. Noninteracting eigenstates (Hermite 
polynomials) are used as an initial guess for the coarsest grid calculation. Spin-unpolarized 
calculations have been performed for the simplicity of performance comparison, although 
the spin-polarized scheme should be employed to observe possible spin-polarized states and 
the corresponding Hund's rule 0. 

We first show the CPU times of self-consistent iterations in 1-, 2-, 3-, and 4-level 24- 
electron calculations in Fig. ^ to contrast the characteristics of self-consistent calculations 
in the conventional 1-level and multigrid methods. The horizontal axis stands for the self- 
consistency iteration index, while the vertical axis is the required computer time for a given 
iteration step. Interpolation and orthonormalization steps in multi-level calculations are 
indicated by downward arrows. While the multigrid calculations requires more number of 
self-consistent iterations in general, they are mostly performed in the coarsest grid, and at 
the finest grid level h we only need two or three iterations, which demonstrates that coarse 
grid calculations provide a good initial solution for the finest grid h calculation and results 
in significant time saving. 

In Fig. 1^, we compare the performance of different multigrid strategies for different 
number of electrons. Note that in general the use of multigrid improves the computation 
speed, and moreover its efficiency increases with the system size. Computation speed-up 
defined as (CPU time for 1-level calculation) / (CPU time for n-level calculation) amounts to 
more than 7 for the 24 electron case with 4-level method. Second, while the 3- and 4-level 
computations are usually better than the 2-level one, its specific performance varies with 
the number of electrons. The rule of thumb is that 4-level method should be used for the 
electron number larger than 20, otherwise 3-level is sufficient. 
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V. CONCLUSIONS 



In the modern computation era when the increase of computational capabihty increases 
almost exponentially with time, it is clear that physicists can attack more ambitious prob- 
lems requiring more challenging large-scale computations. However, with the growth of 
the size of the problem, typically the complexity of the problem itself, hence the compli- 
cation of the code also increases. Object-oriented methodology can be a valuable solution 
to this problem of complexity of modern scientific computations, and, in this paper, we 
showed one example of the application of the object-orientation methodology to the large- 
scale code implementation in Fortran 90. Specifically, we treated a real-space grid-based 
electronic-structure program which solves the KS and Poisson equations self-consistently, 
and especially explained how we have implemented the one-way multigrid method we have 
recently proposed using the object-oriented techniques. According to our experience, we 
believe that it pays to write a scientific program in object-oriented fashion in the long run, 
and further the cost we have to pay is minimal compared with its benefits even when using 
a non-object-oriented language Fortran 90. 
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FIGURES 



FIG. 1. Flowchart of the current higher-order finite-difference electronic-structure calculations 
based on the iterative CG diagonalization. Only the dotted parts are repeated in the higher-level 
multigrid calculations shown in Fig. |^. 

FIG. 2. Two-dimensional representation of the hierarchy of three physical objects for grid-based 
calculations: (a) simulation box, (b) grid, and (c) sub-grid. Only filled circles in (c) are actually 
used for computations. 

FIG. 3. Examples of multigrid algorithmic flow: (a) correction path, and (b) nested iteration 
path. Level 3 has the finest grid, level 1 the coarsest; computation flows from left to right. 

FIG. 4. Schematic diagram of the present one-way multigrid algorithm for the case of 
three-level {2h, 1.5/i, and h) calculations. The calculation starts at the coarsest level (level 1, 
2h) at the bottom, and ends at the finest grid (level 3, h) at the top. At level 2 and 3, only the 
dotted parts of the self-consistent calculation in Fig. || are performed. Orbital interpolation and 
orthogonalization step is taken after each coarse grid (level 1 and 2) calculation. 

FIG. 5. CPU time vs. self-consistent iteration number of twenty-four-electron quantum dot 
calculations in (a) one-level (h), (b) two-level {2h and h), (c) three-level (2/i, 1.5/i, and h), and 
(d) four-level {4h, 2h, 1.5h, and h) schemes. Downward arrows in (b), (c), and (d) indicate 
interpolation-orthonormalization steps. Total computation time is (a) 59.5, (b) 12.9, (c) 11.0, and 
(d) 8.3 minutes. Calculations are performed on DEC alpha 500au workstations. 

FIG. 6. Comparison of the computational efficiency enhancement in n— level one-way multigrid 
methods, where n is 2 (2/i and h), 3 {2h, 1.5h, and h), and 4 (4/i, 2h, 1.5h, and h), for electron 
number 8, 12, 16, 20, and 24. Computation speed-up has been defined as (CPU time for 1-level 
calculation) /(CPU time for n-level calculation). Total computation time of 1-level calculation has 
been denoted in minutes for each electron number. Calculations are performed on DEC alpha 
500au workstations. 
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